
    Experiments in Spoken Document Retrieval at CMU

    We describe our submission to the TREC-7 Spoken Document Retrieval (SDR) track, together with the speech recognition and information retrieval engines behind it. We present SDR evaluation results and a brief analysis. A few developments are also described in greater detail, including a new probabilistic retrieval engine based on language models

    ANSES: summarisation of news video

    We describe the Automatic News Summarisation and Extraction System (ANSES), which captures television news each day with the accompanying subtitles and identifies and extracts news stories from the video. Lexical chain analysis is used to provide a summary of each story and important entities are highlighted in the text

    Experiments in Spoken Document Retrieval at CMU

    We describe our submission to the TREC-6 Spoken Document Retrieval (SDR) track, together with the speech recognition and information retrieval engines behind it. We present SDR evaluation results and a brief analysis. A few developments and experiments are also described in detail, including vocabulary size experiments, which assess the effect of words missing from the speech recognition vocabulary. For our 51,000-word vocabulary the effect was minimal

    Linking Everything to Everything: Journal Publishing Myth or Reality?

    Reference lists are an important facet of the modern academic journal. This form of 'hyperlinking' becomes enormously more powerful when translated to the World Wide Web, both in terms of the speed of link following and in the number of linked documents that can be made accessible. Electronic Press Ltd (EP), one of the first commercial publishers to commit to electronic publishing on the Web, plans to extend the practice of citation linking, aiming to link a document not just to a cited source but to all other documents that contain relevant information. Relevance in this case is defined as all referring, or referred to, documents. The paper discusses EP's approach to link creation on this scale, which is based on an internalised system. One way of extending this approach is to support link creation as well as link following on a distributed network such as the Web. The Open Journal Project has built some first demonstrations, which are outlined in the paper. The convergence between these two approaches suggests some important new motivations for online journal publishing. Some of these features will eventually transform journal usage

    Towards Universal Linking for Electronic Journals

    Reference lists are an important facet of the modern academic journal. This form of 'hyperlinking' becomes enormously more powerful when translated to the World Wide Web, both in terms of the speed of link following and in the number of linked documents that can be made accessible. Electronic Press Ltd (EP), one of the first commercial publishers to commit to electronic publishing on the Web, plans to extend the practice of citation linking, aiming to link a document not just to a cited source but to all other documents that contain relevant information. Relevance in this case is defined as all referring, or referred to, documents. The paper discusses EP's approach to link creation on this scale, which is based on an internalised system. One way of extending this approach is to support link creation as well as link following on a distributed network such as the Web. The Open Journal Project has built some first demonstrations, which are outlined in the paper. The convergence between these two approaches suggests some important new motivations for online journal publishing. Some of these features will eventually transform journal usage

    Experiments In Information Retrieval From Spoken Documents

    This paper describes the experiments performed as part of the TREC-97 Spoken Document Retrieval Track. The task was to pick the correct document from 35 hours of recognized speech documents, based on a text query describing exactly one document. Among the experiments we describe here are: vocabulary size experiments to assess the effect of words missing from the speech recognition vocabulary; experiments with speech recognition using a stemmed language model; using confidence annotations that estimate the correctness of each recognized word; and using multiple hypotheses from the recognizer. Finally, we also measured the effects of corpus size on the SDR task. Despite fairly high word error rates, information retrieval performance was only slightly degraded for documents transcribed by the speech recognizer. 1. INTRODUCTION For the first time, the 1997 Text REtrieval Conference (TREC97) included an evaluation track for information retrieval on spoken documents. In this paper, we describe ..

    Towards LarKC: a Platform for Web-scale Reasoning

    Current Semantic Web reasoning systems do not scale to the requirements of their hottest applications, such as analyzing data from millions of mobile devices, dealing with terabytes of scientific data, and content management in enterprises with thousands of knowledge workers. In this paper, we present our plan of building the Large Knowledge Collider, a platform for massive distributed incomplete reasoning that will remove these scalability barriers. This is achieved by (i) enriching the current logic-based Semantic Web reasoning methods, (ii) employing cognitively inspired approaches and techniques, and (iii) building a distributed reasoning platform and realizing it both on a high-performance computing cluster and via "computing at home". In this paper, we will discuss how the technologies of LarKC would move beyond the state-of-the-art of Web-scale reasoning